Search Results

Documents authored by Ben-Nun, Stav


Document
Time-Space Tradeoffs for Finding a Long Common Substring

Authors: Stav Ben-Nun, Shay Golan, Tomasz Kociumaka, and Matan Kraus

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
We consider the problem of finding, given two documents of total length n, a longest string occurring as a substring of both documents. This problem, known as the Longest Common Substring (LCS) problem, has a classic 𝒪(n)-time solution dating back to the discovery of suffix trees (Weiner, 1973) and their efficient construction for integer alphabets (Farach-Colton, 1997). However, these solutions require Θ(n) space, which is prohibitive in many applications. To address this issue, Starikovskaya and Vildhøj (CPM 2013) showed that for n^{2/3} ≤ s ≤ n, the LCS problem can be solved in 𝒪(s) space and 𝒪̃(n²/s) time. Kociumaka et al. (ESA 2014) generalized this tradeoff to 1 ≤ s ≤ n, thus providing a smooth time-space tradeoff from constant to linear space. In this paper, we obtain a significant speed-up for instances where the length L of the sought LCS is large. For 1 ≤ s ≤ n, we show that the LCS problem can be solved in 𝒪(s) space and 𝒪̃(n²/(L⋅s) +n) time. The result is based on techniques originating from the LCS with Mismatches problem (Flouri et al., 2015; Charalampopoulos et al., CPM 2018), on space-efficient locally consistent parsing (Birenzwige et al., SODA 2020), and on the structure of maximal repetitions (runs) in the input documents.

Cite as

Stav Ben-Nun, Shay Golan, Tomasz Kociumaka, and Matan Kraus. Time-Space Tradeoffs for Finding a Long Common Substring. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 5:1-5:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{bennun_et_al:LIPIcs.CPM.2020.5,
  author =	{Ben-Nun, Stav and Golan, Shay and Kociumaka, Tomasz and Kraus, Matan},
  title =	{{Time-Space Tradeoffs for Finding a Long Common Substring}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{5:1--5:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020.5},
  URN =		{urn:nbn:de:0030-drops-121302},
  doi =		{10.4230/LIPIcs.CPM.2020.5},
  annote =	{Keywords: longest common substring, time-space tradeoff, local consistency, periodicity}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail